Harvesting Relational and Structured Knowledge for Ontology Building in the WPro Architecture

Authors

  • Daniele Bagni
  • Marco Cappella
  • Maria Teresa Pazienza
  • Marco Pennacchiotti
  • Armando Stellato
Abstract

We present two algorithms for supporting semi-automatic ontology building, integrated in WPro, a new architecture for ontology learning from Web documents. The first algorithm automatically extracts ontological entities from tables, using specific heuristics and WordNet-based analysis. The second algorithm harvests semantic relations from unstructured texts using Natural Language Processing techniques. The integration in WPro lets the user interact easily with the system to validate and modify the extracted knowledge and to upload it into an existing ontology. Both algorithms show promising performance in the extraction process and offer a practical means to speed up the overall ontology building process.
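
The abstract does not say which Natural Language Processing techniques the second algorithm uses. Purely as an illustration of what harvesting a semantic relation from unstructured text can look like, the sketch below assumes a single Hearst-style lexical pattern ("X such as A, B and C"). The pattern, the function name `extract_is_a`, and the triple format are hypothetical and are not taken from WPro.

```python
import re

# Hypothetical lexical pattern: "<hypernym> such as <A>, <B> and <C>".
# Single-word terms only; a real system would work on parsed noun phrases.
PATTERN = re.compile(r"(\w+) such as ((?:\w+(?:, | and | or )?)+)")


def extract_is_a(sentence):
    """Return (hyponym, 'is-a', hypernym) triples matched by the pattern."""
    triples = []
    for match in PATTERN.finditer(sentence):
        hypernym = match.group(1)
        # Split the enumerated list on its separators.
        hyponyms = re.split(r", | and | or ", match.group(2))
        triples.extend((h, "is-a", hypernym) for h in hyponyms if h)
    return triples


if __name__ == "__main__":
    text = "The shop sells instruments such as guitars, violins and flutes."
    for triple in extract_is_a(text):
        print(triple)
```

In a semi-automatic setting like the one the abstract describes, candidate triples of this kind would be shown to the user for validation before being added to an ontology.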

Similar resources

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

As the Web grows, establishing a general framework for data mining and for retrieving structured data from the Web poses many challenges. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

Web Harvesting

DEFINITION Web harvesting describes the process of gathering and integrating data from various heterogeneous web sources. Necessary input is an appropriate knowledge representation of the domain of interest (e.g. an ontology), together with example instances of concepts or relationships (seed knowledge). Output is structured data (e.g. in the form of a relational database) that is gathered from...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Thermal Behavior of Double Skin Facade in Terms of Energy Consumption in the Climate of North of Iran-Rasht

Industrialization and the increasing demand for fossil fuels have made energy a strategic factor. The energy crisis and the emergence of modern architecture have led designers to pay more attention to the building envelope. Building skins play an important role in a building's thermal behavior and in reducing energy consumption. If Double Skin Facades are properly designed...

Developing a Process-Oriented Ontology for Knowledge Management Technologies

This paper attempts to develop a new ontology for knowledge management (KM) technologies, determining the relationships between these technologies and classifying them. The study applies NOY methodology. Protégé software and the OWL language are used for building the ontology. The presented ontology is evaluated with abbreviation and consistency criteria and knowledge retrieval of KM tec...

Publication date: 2007